Abstract: In this paper, we propose a new class of bit flipping
algorithms for low-density parity-check (LDPC) codes over the
binary symmetric channel (BSC). Compared to the regular (parallel
or serial) bit flipping algorithms, the proposed algorithms
employ one additional bit at a variable node to represent its
"strength." The introduction of this additional bit increases the
guaranteed error correction capability by a factor of at least 2. An
additional bit can also be employed at a check node to capture
information which is beneficial to decoding. A framework for
failure analysis of the proposed algorithms is described. These
algorithms outperform the Gallager A/B algorithm and the min-sum
algorithm at much lower complexity. Concatenation of two-bit
bit flipping algorithms shows a potential to approach the
performance of belief propagation (BP) decoding in the error
floor region, also at lower complexity.

I. Introduction 

High speed communication systems such as flash memory, 
optical communication and free space optics require extremely 
fast and low complexity error correcting schemes. Among 
existing decoding algorithms for LDPC codes [1] on the BSC,
the bit flipping (serial or parallel) algorithms are least complex
yet possess desirable error correcting abilities. First described
by Gallager [1], the parallel bit flipping algorithm was shown
by Zyablov and Pinsker [2] to be capable of asymptotically
correcting a linear number of errors (in the code length) for
almost all codes in the regular ensemble with left-degree $\gamma \geq 5$.
Later, Sipser and Spielman [3] used expander graph arguments
to show that this algorithm and the serial bit flipping algorithm
can correct a linear number of errors if the underlying Tanner
graph is a good expander. Note that their arguments also apply
for regular codes with left-degree $\gamma \geq 5$. It was then recently
shown by Burshtein [4] that regular codes with left-degree
$\gamma = 4$ are also capable of correcting a linear number of errors
under the parallel bit flipping algorithm.

Despite being theoretically valuable, the above-mentioned 
capability to correct a linear number of errors is not practically 
attractive. This is mainly because the fraction of correctable 
errors is extremely small and hence the code length must be 
large. Besides, the above-mentioned results do not apply for 
column-weight-three codes, which allow very low decoding 
complexity. Also, compared to hard decoding message passing 
algorithms such as the Gallager A/B algorithm, the error 
performance of the bit flipping algorithms on finite length 
codes is usually inferior. This drawback is especially visible
for column-weight-three codes for which the guaranteed error
correction capability is upper-bounded by $\lceil g/4 \rceil - 1$ (to be
discussed later), where g is the girth of a code. The fact
that a code with g = 6 or g = 8 cannot correct certain error
patterns of weight two indeed makes the algorithm impractical
regardless of its low complexity.

In recent years, numerous bit-flipping-oriented decoding 
algorithms have been proposed (see [5] for a list of references).
However, almost all of these algorithms require some soft
information from a channel with capacity larger than that of the
BSC. A few exceptions include the probabilistic bit flipping
algorithm (PBFA) proposed by Miladinovic and Fossorier [6].
In that algorithm, whenever the number of unsatisfied check 
nodes suggests that a variable (bit) node should be flipped, 
it is flipped with some probability p < 1 rather than being 
flipped automatically. This random nature of the algorithm 
slows down the decoding, which was demonstrated to be 
helpful in practical codes whose Tanner graphs contain cycles. 
The idea of slowing down the decoding can also be found in 
a bit flipping algorithm proposed by Chan and Kschischang 
[7]. This algorithm, which is used on the additive white
Gaussian noise channel (AWGNC), requires a certain number 
of decoding iterations between two possible flips of a variable 
node. 

In this paper, we propose a new class of bit flipping 
algorithms for LDPC codes on the BSC. These algorithms 
are designed in the same spirit as the class of finite alphabet 
iterative message passing algorithms [8]. In the proposed
algorithms, an additional bit is introduced to represent the 
strength of a variable node. Given a combination of satisfied 
and unsatisfied check nodes, the algorithm may reduce the 
strength of a variable node before flipping it. An additional bit 
can also be introduced at a check node to indicate its reliability. 
The novelty of these algorithms is three-fold. First, similar to 
the above-mentioned PBFA, our class of algorithms also slows 
down the decoding. However, they only do so when necessary
and in a deterministic manner. Second, their deterministic 
nature and simplicity allow simple and thorough analysis. All 
subgraphs up to a certain size on which an algorithm fails to 
converge can be found by a recursive algorithm. Consequently, 
the guaranteed error correction capability of a code with such 
algorithms can be derived. Third, the failure analysis of an 
algorithm gives rise to better algorithms. More importantly, 
it leads to decoders which use a concatenation of two-bit bit 
flipping algorithms. These decoders show excellent trade-offs
between complexity and performance. 

The rest of the paper is organized as follows. Section II
provides preliminaries. Section III motivates and describes
the class of two-bit bit flipping algorithms. Section IV gives
a framework to analyze these algorithms. Finally, numerical
results are presented in Section V along with discussion.

II. Preliminaries 

Let C denote an (n, k) LDPC code over the binary field
GF(2). C is defined by the null space of H, an $m \times n$ parity
check matrix. H is the bi-adjacency matrix of G, a Tanner
graph representation of C. G is a bipartite graph with two
sets of nodes: n variable nodes and m check nodes. In a
$\gamma$-left-regular code, all variable nodes have degree $\gamma$. Each
check node imposes a constraint on the neighboring variable
nodes. A check node is said to be satisfied by a setting of
variable nodes if the modulo-two sum of its neighbors is zero,
otherwise it is unsatisfied. A vector $\mathbf{v} = \{v_1, v_2, \ldots, v_n\}$ is
a codeword if and only if all check nodes are satisfied. The
length of the shortest cycle in the Tanner graph G is called
the girth g of G.
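As a small illustration of these definitions, the following Python sketch tests which check nodes are satisfied and whether a vector is a codeword. The function names and the toy matrix are assumptions made for this example and do not come from the paper.

```python
def check_status(H, v):
    """Return one boolean per check node, True when it is satisfied.

    H is an m x n parity check matrix given as a list of 0/1 rows,
    v is a length-n list of 0/1 variable node values.
    """
    return [sum(h * x for h, x in zip(row, v)) % 2 == 0 for row in H]


def is_codeword(H, v):
    # v is a codeword if and only if all check nodes are satisfied.
    return all(check_status(H, v))


# Toy 3 x 4 matrix, chosen only for illustration (not a code from the paper).
H = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
```

Each row of H plays the role of one check node, and the modulo-two sum over the row's nonzero positions decides whether that check is satisfied.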

In this paper, we consider 3-left-regular LDPC codes with 
girth g = 8, although the class of two-bit bit flipping algorithms 
can be generalized to decode any LDPC code. We assume 
transmission over the BSC. A variable node is said to be 
corrupt if it is different from its original sent value, otherwise 
it is correct. Throughout the paper, we also assume without 
loss of generality that the all-zero codeword is transmitted. Let 
$\mathbf{y} = \{y_1, y_2, \ldots, y_n\}$ denote the input to an iterative decoder.
With the all-zero codeword assumption, the support of $\mathbf{y}$,
denoted as supp($\mathbf{y}$), is simply the set of variable nodes initially
corrupt. In our case, a variable node is corrupt if it is 1 and 
is correct if it is 0. 

A simple hard decision decoding algorithm for LDPC codes 
on the BSC, known as the parallel bit flipping algorithm 
[2], [3], is defined as follows. For any variable node v in a
Tanner graph G, let $n_c^{(s)}(v)$ and $n_c^{(u)}(v)$ denote the number
of satisfied check nodes and unsatisfied check nodes that are
connected to v, respectively.

Algorithm Parallel Bit Flipping Algorithm

• In parallel, flip each variable node v if $n_c^{(u)}(v) > n_c^{(s)}(v)$.

• Repeat until all check nodes are satisfied.
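The two steps above can be sketched in Python as follows. This is a minimal sketch and not the authors' implementation: the list-of-rows representation of H, the function names and the iteration cap `max_iter` are assumptions (the algorithm as stated simply repeats until all check nodes are satisfied).

```python
def syndrome(H, v):
    """1 marks an unsatisfied check node, 0 a satisfied one."""
    return [sum(h * x for h, x in zip(row, v)) % 2 for row in H]


def parallel_bit_flipping(H, y, max_iter=50):
    """Parallel bit flipping on hard decisions.

    H: parity check matrix as a list of 0/1 rows.
    y: received word as a list of 0/1 values.
    Returns (word, converged). max_iter is an assumed safety cap.
    """
    v = list(y)
    for _ in range(max_iter):
        unsat = syndrome(H, v)
        if not any(unsat):
            return v, True
        flipped = list(v)
        for j in range(len(v)):
            checks = [i for i, row in enumerate(H) if row[j]]
            n_u = sum(unsat[i] for i in checks)  # unsatisfied neighbors
            n_s = len(checks) - n_u              # satisfied neighbors
            if n_u > n_s:
                # Every decision uses the same syndrome, so flips are parallel.
                flipped[j] ^= 1
        v = flipped
    return v, not any(syndrome(H, v))
```

Computing all flip decisions from one syndrome before applying any of them is what distinguishes the parallel variant from the serial one.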



III. The class of two-bit bit flipping algorithms 

The class of two-bit bit flipping algorithms is described in 
this section. We start with two motivating examples. The first 
one illustrates the advantage of an additional bit at a variable 
node while the second illustrates the advantage at a check 
node. 

A. First Motivating Example: Two-bit Variable Nodes 

In this subsection, the symbols ○ and ● denote a correct and a
corrupt variable node, while □ and ■ denote a satisfied and an
unsatisfied check node. Let C be a 3-left-regular LDPC code
with girth g = 8 and assume that the variable nodes $v_1$, $v_2$, $v_3$
and $v_4$ form an eight-cycle as shown in Fig. 1. Also assume
that only $v_1$ and $v_3$ are initially in error and that the parallel bit
flipping algorithm is employed. In the first iteration, illustrated
in Fig. 1(a), $c_1$, $c_2$, $c_3$, $c_4$, $c_5$ and $c_7$ are unsatisfied while $c_6$
and $c_8$ are satisfied. Since $n_c^{(u)}(v_1) = n_c^{(u)}(v_3) = 3$ and
$n_c^{(s)}(v_1) = n_c^{(s)}(v_3) = 0$, $v_1$ and $v_3$ are flipped and become correct.
However, $v_2$ and $v_4$ are also flipped and become incorrect
since $n_c^{(u)}(v_2) = n_c^{(u)}(v_4) = 2$ and $n_c^{(s)}(v_2) = n_c^{(s)}(v_4) = 1$.
In the second iteration (Fig. 1(b)), the algorithm again flips
$v_1$, $v_2$, $v_3$ and $v_4$. It can be seen that the set of corrupt variable
nodes alternates between $\{v_1, v_3\}$ and $\{v_2, v_4\}$, and thus the
algorithm does not converge.

Fig. 1. Weight-two error configurations uncorrectable by the parallel bit
flipping algorithm.

The parallel bit flipping algorithm fails in the above situation
because it uses the same treatment for variable nodes with
$n_c^{(u)}(v) = 3$ and $n_c^{(u)}(v) = 2$. The algorithm is too "aggressive"
when flipping a variable node v with $n_c^{(u)}(v) = 2$. Let us
consider a modified algorithm which only flips a variable node
v with $n_c^{(u)}(v) = 3$. This modified algorithm will converge in
the above situation. However, if only $v_1$ and $v_2$ are initially in
error (Fig. 1(c)), then the modified algorithm does not converge
because it does not flip any variable node. The modified
algorithm is now too "cautious" to flip a variable node v with
$n_c^{(u)}(v) = 2$.

Both decisions (to flip and not to flip a variable node v with
$n_c^{(u)}(v) = 2$) can lead to decoding failure. However, we must
pick one or the other due to the assumption that a variable node
takes its value from the set {0, 1}. Relaxing this assumption
is therefore required for a better bit flipping algorithm.
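Both failure modes can be reproduced on a toy model of the eight-cycle. The sketch below is an illustration under stated assumptions: the subgraph is cut out of the code, each variable node gets one private check in place of its third neighbor (all other variable nodes are assumed correct and omitted), and the check numbering is hypothetical rather than that of Fig. 1.

```python
# Variable nodes 0..3 stand for v1..v4. The first four checks form the
# eight-cycle; the last four are the private third check of each node.
CHECKS = [(0, 1), (1, 2), (2, 3), (3, 0), (0,), (1,), (2,), (3,)]


def unsat_counts(v):
    """Number of unsatisfied checks attached to each variable node."""
    counts = [0] * len(v)
    for check in CHECKS:
        if sum(v[j] for j in check) % 2 == 1:
            for j in check:
                counts[j] += 1
    return counts


def step(v, threshold):
    """One parallel iteration: flip every node with at least `threshold`
    unsatisfied checks. With degree 3, threshold=2 is the majority rule
    and threshold=3 is the modified, more cautious rule."""
    n_u = unsat_counts(v)
    return [bit ^ 1 if n_u[j] >= threshold else bit for j, bit in enumerate(v)]


def run(v, threshold, iterations):
    """Return the list of hard-decision vectors visited."""
    history = [list(v)]
    for _ in range(iterations):
        v = step(v, threshold)
        history.append(list(v))
    return history


aggressive = run([1, 0, 1, 0], threshold=2, iterations=4)  # v1, v3 in error
cautious = run([1, 1, 0, 0], threshold=3, iterations=4)    # v1, v2 in error
```

In this toy model `aggressive` alternates between the two error sets and `cautious` never changes, which matches the behaviour described in the text for each rule.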

Let us now assume that a variable node can take four values
instead of two. Specifically, a variable node takes its value
from the set $\mathcal{A}_v = \{0_s, 0_w, 1_w, 1_s\}$, where $0_s$ ($1_s$) stands for
"strong zero" ("strong one") and $0_w$ ($1_w$) stands for "weak
zero" ("weak one"). Assume for now that a check node only
sees a variable node either as 0 if the variable node is $0_s$ or $0_w$,
or as 1 if the variable node is $1_s$ or $1_w$. Recall that
$n_c^{(u)}(v) \in \{0, 1, 2, 3\}$ is the number of unsatisfied check nodes that are
connected to the variable node v. Let
$f_1 : \mathcal{A}_v \times \{0, 1, 2, 3\} \to \mathcal{A}_v$
be the function defined in Table I.

TABLE I
$f_1 : \mathcal{A}_v \times \{0, 1, 2, 3\} \to \mathcal{A}_v$

Consider the following bit flipping algorithm.

Algorithm 1 Two-bit Bit Flipping Algorithm 1 (TBFA1)
Initialization: Each variable node v is initialized to $0_s$ if $y_v = 0$
and is initialized to $1_s$ if $y_v = 1$.

• In parallel, flip each variable node v to $f_1(v, n_c^{(u)}(v))$.

• Repeat until all check nodes are satisfied.
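The structure of Algorithm 1 can be sketched in Python as below. The update table `F1` is a hypothetical stand-in and is NOT Table I: only the entries described in the text (a strong node with two unsatisfied checks becomes weak, a weak node with two unsatisfied checks flips to the opposite strong value, a strong node with one unsatisfied check is unchanged) are taken from the paper, and every other entry is an assumption. The string encoding of values, the adjacency format and `max_iter` are also assumptions.

```python
# Hypothetical update table: F1[value][number of unsatisfied checks].
# Entries not described in the text are assumptions made for this sketch.
F1 = {
    "0s": ["0s", "0s", "0w", "1s"],
    "0w": ["0s", "0w", "1s", "1s"],
    "1w": ["1s", "1w", "0s", "0s"],
    "1s": ["1s", "1s", "1w", "0s"],
}


def hard(value):
    # A check node sees 0s and 0w as 0, and 1s and 1w as 1.
    return int(value[0])


def tbfa1(checks, y, table=F1, max_iter=30):
    """Two-bit bit flipping with a single update table.

    checks: list of tuples of variable indices, one tuple per check node.
    y: received word as a list of 0/1 values.
    Returns (final two-bit states, converged).
    """
    state = ["1s" if bit else "0s" for bit in y]  # initialization
    for _ in range(max_iter):
        n_u = [0] * len(state)
        for check in checks:
            if sum(hard(state[j]) for j in check) % 2 == 1:
                for j in check:
                    n_u[j] += 1
        if not any(n_u):  # no unsatisfied check anywhere
            return state, True
        # Parallel update: every node uses the counts from the same iteration.
        state = [table[s][n_u[j]] for j, s in enumerate(state)]
    done = all(sum(hard(state[j]) for j in c) % 2 == 0 for c in checks)
    return state, done
```

Passing the table as a parameter keeps the decoder loop independent of the particular function, so a different mapping can be substituted without touching the rest of the code.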

Compared to the parallel bit flipping algorithm and its
modified version discussed above, the TBFA1 possesses a
gentler treatment for a variable node v with $n_c^{(u)}(v) = 2$. It
tries to reduce the "strength" of v before flipping it. One may
realize at this point that it is rather imprecise to say that the
TBFA1 flips a variable node v from $0_s$ to $0_w$ or vice versa,
since a check node still sees v as 0. However, as the values
of v can be represented by two bits, i.e., $\mathcal{A}_v$ can be mapped
onto the alphabet {01, 00, 10, 11}, the flipping of v should be
understood as either the flipping of one bit or the flipping of
both bits.

It is easy to verify that the TBFA1 is capable of correcting
the error configurations shown in Fig. 1. Moreover, the guaranteed
correction capability of this algorithm is given in the
following proposition.

Proposition 1: The TBFA1 is capable of correcting any
error pattern with up to g/2 - 1 errors in a left-regular
column-weight-three code with Tanner graph G which has girth g < 12
and which does not contain any codeword of weight w < g.
Proof: The proof is omitted due to page limits. ■

Remarks:

• It can be shown that the guaranteed error correction capability
of a 3-left-regular code with the parallel bit flipping
algorithm is strictly less than $\lceil g/4 \rceil$. Thus, the TBFA1
increases the guaranteed error correction capability by a
factor of at least 2.

• In [9], we have shown that the Gallager A/B algorithm is
capable of correcting any error pattern with up to g/2 - 1
errors in a 3-left-regular code with girth $g \geq 10$. For codes
with girth g = 8 and minimum distance $d_{\min} > 8$, the
Gallager A/B algorithm can only correct up to two errors.
This means that the guaranteed error correction capability
of the TBFA1 is at least as good as that of the Gallager
A/B algorithm (and better for codes with g = 8). It is also
not difficult to see that the complexity of the TBFA1 is
much lower than that of the Gallager A/B algorithm.
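As a worked instance of these bounds, take the girth considered in this paper, g = 8. The arithmetic below only substitutes g = 8 into the two expressions quoted in the text.

```latex
\left\lceil \frac{g}{4} \right\rceil - 1
  = \left\lceil \frac{8}{4} \right\rceil - 1 = 1,
\qquad
\frac{g}{2} - 1 = \frac{8}{2} - 1 = 3 .
```

For g = 8 the parallel bit flipping algorithm is therefore guaranteed to correct at most one error, the Gallager A/B algorithm up to two, and the TBFA1 up to three under the conditions of Proposition 1.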

Now that the advantage of having more than one bit to 
represent the values of a variable node is clear, let us explore 
the possibility of using more than one bit to represent the 
values of a check node in the next subsection. 

B. Second Motivating Example: Two-bit Check Nodes 

In this subsection, we use the symbols ○ and ● to denote
a $0_s$ variable node and a $1_s$ variable node, respectively. The
symbols used to denote a $0_w$ variable node and a $1_w$ variable
node are shown in Fig. 2(b), where $v_2$ is a $0_w$ variable node
and $v_1$ is a $1_w$ variable node. The symbols □ and ■ still
represent a satisfied and an unsatisfied check node.

Fig. 2. The decoding of the two-bit parallel bit flipping algorithm 1 on a
weight-four error configuration.

Assume a decoder that uses the TBFA1 algorithm. Fig. 2(a),
(b) and (c) illustrate the first, second and third decoding
iterations of the TBFA1 on an error configuration with four
variable nodes $v_1$, $v_3$, $v_4$ and $v_6$ that are initially in error. We
assume that all variable nodes which are not in this subgraph
remain correct during decoding and will not be referred to. In
the first iteration, variable nodes $v_1$, $v_2$, $v_5$, $v_6$ and $v_7$ are strong
and connected to two unsatisfied check nodes. Consequently,
the TBFA1 reduces their strength. Since variable nodes $v_3$
and $v_4$ are strong and only connected to one unsatisfied check
node, their values are not changed. In the second iteration, all
check nodes retain their values (satisfied or unsatisfied) from
the first iteration. The TBFA1 hence flips $v_1$ and $v_6$ from $1_w$ to
$0_s$ and flips $v_2$, $v_5$ and $v_7$ from $0_w$ to $1_s$. At the beginning of
the third iteration, the value of any variable node is either $0_s$
or $1_s$. Every variable node is connected to two satisfied check
nodes and one unsatisfied check node. Since no variable node
can change its value, the algorithm fails to converge.

The failure of the TBFA1 to correct this error configuration
can be attributed to the fact that check node $c_3$ is connected to
two initially erroneous variable nodes $v_3$ and $v_4$, consequently
preventing them from changing their values. Let us slightly
divert from our discussion and revisit the PBFA proposed
by Miladinovic and Fossorier [6]. The authors observed that
variable node estimates corresponding to a number close to
$\lceil \gamma/2 \rceil$ unsatisfied check nodes are unreliable due to multiple
errors, cycles in the code graph and equally likely a priori
hard decisions. Based on this observation, the PBFA only
flips a variable node with some probability p < 1. In the
above error configuration, a combination of two unsatisfied
and one satisfied check nodes would be considered unreliable.
Therefore, the PBFA would flip the corrupt variable nodes
$v_1$ and $v_6$ as well as the correct variable nodes $v_2$, $v_5$ and
$v_7$ with the same probability p. However, one can see that a
combination of one unsatisfied and two satisfied check nodes
would also be unreliable because such a combination prevents
the corrupt variable nodes $v_3$ and $v_4$ from being corrected.
Unfortunately, the PBFA cannot flip variable nodes with fewer
than $\gamma/2$ unsatisfied check nodes since many other correct
variable nodes in the Tanner graph would also be flipped.
In other words, the PBFA cannot evaluate the reliability of
estimates corresponding to a number close to $\lfloor \gamma/2 \rfloor$ unsatisfied
check nodes. We demonstrate that such reliability can be
evaluated with a new concept introduced below.

Fig. 3. Possible values and transition of a check node.

Revisit the decoding of the TBFA1 on the error configuration
illustrated in Fig. 2. Notice that in the third iteration,
except check node $c_3$, all check nodes that are unsatisfied in
the second iteration become satisfied while all check nodes
that are satisfied in the second iteration become unsatisfied.
We will provide this information to the variable nodes.

Definition 1: A satisfied (unsatisfied) check node is called
previously satisfied (previously unsatisfied) if it was satisfied
(unsatisfied) in the previous decoding iteration, otherwise it is
called newly satisfied (newly unsatisfied).
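Definition 1 amounts to comparing the status of each check node in two consecutive iterations. A minimal Python sketch follows; the function name and the boolean-list representation are assumptions made for illustration.

```python
def classify_checks(prev_unsat, curr_unsat):
    """Label each check node according to Definition 1.

    prev_unsat, curr_unsat: lists of booleans (True = unsatisfied) giving
    the check status in the previous and in the current iteration.
    """
    labels = []
    for before, now in zip(prev_unsat, curr_unsat):
        if now:
            labels.append("previously unsatisfied" if before else "newly unsatisfied")
        else:
            labels.append("newly satisfied" if before else "previously satisfied")
    return labels
```

Only one extra bit per check node (its status in the previous iteration) has to be stored to produce these four labels.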

The possible transitions of a check node are illustrated in
Fig. 3. Let $n_c^{(ps)}(v)$, $n_c^{(pu)}(v)$, $n_c^{(ns)}(v)$ and $n_c^{(nu)}(v)$ be
the number of previously satisfied check nodes, previously
unsatisfied check nodes, newly satisfied check nodes and
newly unsatisfied check nodes that are connected to a variable
node v, respectively. Let
$f_2 : \mathcal{A}_v \times \{0, 1, 2, 3\}^3 \to \mathcal{A}_v$
be a function defined as follows:

$f_2(v, x, y, z) = f_1(v, x + y)$ if $(x, y, z) \notin \{(0, 1, 2), (0, 1, 1)\}$,
$f_2(v, 0, 1, 2) = v$, $f_2(0_s, 0, 1, 1) = f_2(0_w, 0, 1, 1) = 0_w$,
$f_2(1_s, 0, 1, 1) = f_2(1_w, 0, 1, 1) = 1_w$.
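The case analysis that defines $f_2$ can be transcribed directly into code. In the sketch below the values of a variable node are encoded as the strings "0s", "0w", "1w" and "1s", and $f_1$ is passed in as a callable because Table I is not reproduced here; both choices, as well as the name `make_f2`, are assumptions made for illustration.

```python
def make_f2(f1):
    """Build f2 from a given f1, following the three defining cases.

    f1(v, n) must return the new value of a node with value v and n
    unsatisfied checks. The arguments x, y, z are used exactly as in the
    definition of f2; only x + y is forwarded to f1.
    """
    def f2(v, x, y, z):
        if (x, y, z) == (0, 1, 2):
            return v  # value is kept unchanged
        if (x, y, z) == (0, 1, 1):
            # Strength is reduced (or stays weak); the sign is preserved.
            return "0w" if v in ("0s", "0w") else "1w"
        return f1(v, x + y)
    return f2
```

Any concrete table for $f_1$ can be plugged in through the `f1` argument without changing the two exceptional cases.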

Consider the following bit flipping algorithm: 



Algorithm 2 Two-bit Bit Flipping Algorithm 2 (TBFA2)
Initialization: Each variable node v is initialized to $0_s$ if $y_v = 0$
and is initialized to $1_s$ if $y_v = 1$. In the first iteration, check
nodes are either previously satisfied or previously unsatisfied.

• In parallel, flip each variable node v to the value that $f_2$ assigns to it.

• Repeat until all check nodes are satisfied.



The TBFA2 considers a combination of one newly unsatisfied,
one newly satisfied and one previously satisfied check
node to be less reliable than a combination of one previously
unsatisfied and two previously satisfied check nodes. Therefore,
it will reduce the strength of $v_3$ and $v_4$ at the end of the
third iteration. Consequently, the error configuration shown in
Fig. 2 can now be corrected after 9 iterations. Proposition 1
also holds for the TBFA2.

Remarks: Let $\mathcal{Q}$ be the set of all functions
$\mathcal{A}_v \times \{0, 1, 2, 3\} \to \mathcal{A}_v$.
A natural question to ask is whether $f_1$ can
be replaced with some $f_1' \in \mathcal{Q}$ such that the TBFA1 algorithm
can correct the error configuration shown in Fig. 2. Brute force
search reveals many such functions. Unfortunately, none of
those functions allow the algorithm to retain its guaranteed
error correction capability stated in Proposition 1.

We recap this section by giving the formal definition of the 
class of two-bit bit flipping algorithms. 

Definition 2: For the class of two-bit bit flipping algorithms,
a variable node v takes its value from the set
$\mathcal{A}_v = \{0_s, 0_w, 1_w, 1_s\}$. A check node sees a $0_s$ and a $0_w$ variable
node as 0 and sees a $1_s$ and a $1_w$ variable node as 1. According
to Definition 1, a check node can be previously satisfied,
previously unsatisfied, newly satisfied or newly unsatisfied. An
algorithm $\mathcal{F}$ is defined by a mapping
$f : \mathcal{A}_v \times \{0, 1, \ldots, \gamma\}^3 \to \mathcal{A}_v$,
where $\gamma$ is the column-weight of a code.

Different algorithms in this class are specified by different 
functions /. In order to evaluate the performance of an 
algorithm, it is necessary to analyze its failures. To that task 
we shall now proceed. 

IV. A FRAMEWORK FOR FAILURE ANALYSIS 

In this section, we describe a framework for the analysis
of two-bit bit flipping algorithms (the details will be provided
in the journal version of this paper). Consider the decoding
of a two-bit bit flipping algorithm $\mathcal{F}$ on a Tanner graph G.
Assume a maximum number of $l$ iterations and assume that
the channel makes k errors. Let $I$ denote the subgraph induced
by the k variable nodes that are initially in error. Let $\mathcal{S}$ be the
set of all Tanner graphs that contain $I$. Let $\mathcal{S}_e$ be the subset
of $\mathcal{S}$ with the following property: if $S \in \mathcal{S}_e$ then there exists
an induced subgraph $J$ of $S$ such that (i) $J$ is isomorphic to
$I$ and (ii) the two-bit bit flipping algorithm fails to decode
on $S$ after $l$ iterations if the k initially corrupt variable nodes
are variable nodes in $J$. Let $\mathcal{S}_e^{m}$ be a subset of $\mathcal{S}_e$ such that
any graph $S_1 \in \mathcal{S}_e$ contains a graph $S_2 \in \mathcal{S}_e^{m}$ and no graph in
$\mathcal{S}_e^{m}$ contains another graph in $\mathcal{S}_e^{m}$. With the above formulation,
we give the following proposition.

Proposition 2: Algorithm $\mathcal{F}$ will converge on G after $l$
decoding iterations if the induced subgraph $I$ is not contained
in any induced subgraph $K$ of G that is isomorphic to a graph
in $\mathcal{S}_e^{m}$.

Proof: If $\mathcal{F}$ fails to converge on G after $l$ iterations then
$G \in \mathcal{S}_e$, hence $I$ must be contained in an induced subgraph $K$
of G that is isomorphic to a graph in $\mathcal{S}_e^{m}$. ■

We remark that Proposition 2 only gives a sufficient condition.
This is because $K$ might be contained in an induced
subgraph of G that is not isomorphic to any graph in $\mathcal{S}_e$.
Nevertheless, $\mathcal{S}_e^{m}$ can still be used as a benchmark to evaluate
the algorithm $\mathcal{F}$. A better algorithm should allow the above
sufficient condition to be met with higher probability. For a
more precise statement, we give the following.

Proposition 3: The probability that a Tanner graph $I$ is
contained in a Tanner graph $K_1$ with $k_1$ variable nodes is
less than the probability that $I$ is contained in a Tanner graph
$K_2$ with $k_2$ variable nodes if $k_1 > k_2$.

Proof: Let $K_2^{(s)}$ be a Tanner graph with $k_1$ variable nodes
such that $K_2^{(s)}$ contains $K_2$. Since $K_1$ and $K_2^{(s)}$ both have $k_1$
variable nodes, the probability that $I$ is contained in $K_1$ equals
the probability that $I$ is contained in $K_2^{(s)}$. On the other hand,
since $K_2^{(s)}$ contains $K_2$, the probability that $I$ is contained in
$K_2^{(s)}$ is less than the probability that $I$ is contained in $K_2$ by
conditional probability. ■
Proposition 3 suggests that a two-bit bit flipping algorithm
should be chosen to maximize the size (in terms of number
of variable nodes) of the smallest Tanner graph in $\mathcal{S}_e^{m}$. Given
an algorithm $\mathcal{F}$, one can find all graphs in $\mathcal{S}_e^{m}$ up to a certain
number of variable nodes by a recursive algorithm. Let $V_e^{(i)}$
denote the set of corrupt variable nodes at the beginning of the
i-th iteration. The algorithm starts with the subgraph $I$, which
is induced by the variable nodes in $V_e^{(1)}$. Let $\mathcal{N}(V_e^{(i)})$ be the
set of check nodes that are connected to at least one variable
node in $V_e^{(i)}$. In the first iteration, only the check nodes in
$\mathcal{N}(V_e^{(1)})$ can be unsatisfied. Therefore, if a correct variable
node becomes corrupt at the end of the first iteration then it
must connect to at least one check node in $\mathcal{N}(V_e^{(1)})$. In all
possible ways, the algorithm then expands $I$ recursively by
adjoining new variable nodes such that these variable nodes
become corrupt at the end of the first iteration. The recursive
introduction of new variable nodes halts if a graph in $\mathcal{S}_e^{m}$ is
found. Let $\mathcal{T}_1$ be the set of graphs obtained by expanding $I$.
Each graph in $\mathcal{T}_1 \setminus \mathcal{S}_e^{m}$ is then again expanded by adjoining
new variable nodes that become corrupt at the end of the
second iteration. This process is repeated $l$ times where $l$ is
the maximum number of iterations.
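The pruning observation used by the recursive search (a correct variable node can only become corrupt if it touches a check node adjacent to the corrupt set) can be illustrated on a fixed graph. The sketch below is not the recursive enumeration itself, which builds graphs rather than inspecting a given one; the adjacency format and function names are assumptions.

```python
def check_neighborhood(var_to_checks, corrupt):
    """Check nodes connected to at least one corrupt variable node."""
    return {c for v in corrupt for c in var_to_checks[v]}


def expansion_candidates(var_to_checks, corrupt):
    """Correct variable nodes that share a check with the corrupt set.

    In the first iteration only these nodes can see an unsatisfied check,
    so only they can become corrupt at the end of that iteration.
    """
    near = check_neighborhood(var_to_checks, corrupt)
    return {
        v
        for v, checks in var_to_checks.items()
        if v not in corrupt and near.intersection(checks)
    }
```

Restricting attention to these candidates is what keeps the expansion finite at each step: nodes outside this set cannot change in the iteration under consideration.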

V. Numerical results and Discussion 

We demonstrate the performance of two-bit bit flipping algorithms
on a regular column-weight-three quasi-cyclic LDPC
code of length n = 768. The code has rate R = 0.75 and
minimum distance $d_{\min} = 12$. Two different decoders are
considered. The first decoder, denoted as BFD1, employs a
single two-bit bit flipping algorithm. The BFD1 may perform
iterative decoding for a maximum number of 30 iterations.
The second decoder, denoted as BFD2, is a concatenation of 55
algorithms, namely $\mathcal{F}_1, \mathcal{F}_2, \ldots, \mathcal{F}_{55}$. Associated with algorithm
$\mathcal{F}_i$ is a maximum number of iterations $l_i$. The BFD2 operates
by performing decoding using algorithm $\mathcal{F}_i$ on an input vector
$\mathbf{y}$ for i = 1, 2, ..., 55 or until a codeword is found. The
maximum possible number of decoding iterations performed
by the BFD2 is $\sum_{i=1}^{55} l_i = 1950$. Details on the algorithms
$\mathcal{F}_1, \mathcal{F}_2, \ldots, \mathcal{F}_{55}$ as well as the parity check matrix of the
quasi-cyclic LDPC code can be found in [10].
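The control flow of such a concatenated decoder can be sketched as follows. The component decoders are passed in as callables because the 55 algorithms themselves are specified in [10] and not here; the function names and the `(decode, max_iter)` pairing are assumptions made for this sketch.

```python
def concatenated_decode(y, algorithms, is_codeword):
    """Try each (decode, max_iter) pair in order on the same input y.

    decode(y, max_iter) must return a candidate word, and
    is_codeword(word) must return True when all checks are satisfied.
    Returns (word, index of the successful algorithm), or (None, None)
    when every algorithm fails.
    """
    for index, (decode, max_iter) in enumerate(algorithms):
        word = decode(y, max_iter)  # every algorithm restarts from y
        if is_codeword(word):
            return word, index
    return None, None


def max_total_iterations(algorithms):
    # Worst case: every algorithm uses its full iteration budget.
    return sum(max_iter for _, max_iter in algorithms)
```

Because the loop stops at the first success, later algorithms cost nothing on inputs that an earlier algorithm already decodes.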

Simulations for frame error rate (FER) are shown in Fig. 4.
Both decoders outperform decoders which use the Gallager
A/B algorithm or the min-sum algorithm. In particular, the
FER performance of the BFD2 is significantly better. More
importantly, the slope of the FER curve of the BFD2 is larger
than that of the BP decoder. This shows the potential of two-bit
bit flipping decoders with comparable or even better error
floor performance than that of the BP decoder. It is also
important to remark that although the BFD2 uses 55 different
decoding algorithms, at cross over probability $\alpha < 0.0025$,
more than 99.99% of codewords are decoded by the first
algorithm. Consequently, the average number of iterations per
output word of the BFD2 is not much higher than that of the
BFD1, as illustrated in Fig. 5. This means that similar to the
BFD1, the BFD2 has an extremely high speed.

Fig. 4. Frame error rate performance of the BFD1 and BFD2.

Fig. 5. Average number of decoding iterations per output word.